Text File | 1993-03-04 | 28KB | 593 lines
local ISPELL(1)
NAME
ispell, buildhash - Interactive spelling checking
SYNOPSIS
ispell [common-flags] [-M|-N] [-Lcontext] [-V] files
ispell [common-flags] -l
ispell [common-flags] [-f file] [-s] {-a|-A}
ispell [-d file] [-w chars] -c
ispell [-d file] [-w chars] -e[e]
ispell [-d file] -D
ispell -v[v]
common-flags:
[-t] [-n] [-b] [-x] [-B] [-C] [-P] [-m] [-S] [-d file] [-p file]
[-w chars] [-W n] [-T type]
buildhash [-s] dict-file affix-file hash-file
buildhash -c count affix-file
DESCRIPTION
Ispell is fashioned after the spell program from ITS (called ispell on
Twenex systems.) The most common usage is "ispell filename". In this
case, ispell will display each word which does not appear in the dic-
tionary at the top of the screen and allow you to change it. If there
are "near misses" in the dictionary (words which differ by only a single
letter, a missing or extra letter, a pair of transposed letters, or a
missing space or hyphen), then they are also displayed on following
lines. As well as "near misses", ispell may display other guesses at
ways to make the word from a known root, with each guess preceded by
question marks. Finally, the line containing the word and the previous
line are printed at the bottom of the screen. If your terminal can
display in reverse video, the word itself is highlighted. You have the
option of replacing the word completely, or choosing one of the sug-
gested words. Commands are single characters as follows (case is
ignored):
R       Replace the misspelled word completely.
Space   Accept the word this time only.
A       Accept the word for the rest of this ispell session.
I       Accept the word, capitalized as it is in the file, and update
        private dictionary.
U       Accept the word, and add an uncapitalized (actually, all
        lowercase) version to the private dictionary.
0-n     Replace with one of the suggested words.
L       Look up words in system dictionary (controlled by the WORDS
        compilation option). *NOT* currently available under OS/2.
X       Write the rest of this file, ignoring misspellings, and start
        next file.
Q       Exit immediately and leave the file unchanged.
!       Shell escape.
^L      Redraw screen.
^Z      Suspend ispell. Currently not available under OS/2 2.0.
?       Give help screen.
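The "near miss" categories described above can be illustrated with a short Python sketch. The function name is_near_miss is hypothetical, and the case-insensitive comparison is an assumption drawn from the -a example later in this page (where "Frey" is offered for "frqy"). This is only an illustration of the categories, not ispell's actual algorithm.

```python
def is_near_miss(typed, candidate):
    """Return True if candidate is one 'near miss' edit away from typed.

    Assumption: case is ignored when comparing letters.
    """
    a, b = typed.lower(), candidate.lower()
    if a == b:
        return False
    if len(a) == len(b):
        diffs = [i for i in range(len(a)) if a[i] != b[i]]
        if len(diffs) == 1:
            return True  # a single wrong letter
        # a pair of adjacent transposed letters
        return (len(diffs) == 2 and diffs[1] == diffs[0] + 1
                and a[diffs[0]] == b[diffs[1]]
                and a[diffs[1]] == b[diffs[0]])
    if abs(len(a) - len(b)) == 1:
        # A missing or extra letter. A missing space or hyphen is the
        # special case where the dropped character is ' ' or '-'.
        longer, shorter = (a, b) if len(a) > len(b) else (b, a)
        return any(longer[:i] + longer[i + 1:] == shorter
                   for i in range(len(longer)))
    return False
```

For instance, is_near_miss("notthe", "not the") is true because the two strings differ only by one missing space.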
If the -M switch is specified, a one-line mini-menu at the bottom of the
screen will summarize these options. Conversely, the -N switch may be
used to suppress the mini-menu. (The mini-menu is displayed by default
if ispell was compiled with the MINIMENU option, but these two switches
will always override the default.)
If the -L flag is given, the specified number is used as the number of
lines of context to be shown at the bottom of the screen. (The default is
to calculate the amount of context as a certain percentage of the screen
size.) The amount of context is subject to a system-imposed limit.
If the -V flag is given, characters that are not in the 7-bit ANSI
printable character set will always be displayed in the style of "cat
-v", even if ispell thinks that these characters are legal ISO Latin-1
on your system. This is useful when working with older terminals.
Without this switch, ispell will display 8-bit characters "as is" if
they have been defined as string characters for the chosen file type.
"Normal" mode, as well as the -l, -a, and -A options (see below) also
accepts the following "common" flags on the command line:
-t
The input file is in TeX or LaTeX format.
-n
The input file is in nroff/troff format.
-b
Create a backup file by appending ".bak" to the name of the input
file.
-x
Don't create a backup file.
-B
Report run-together words with missing blanks as spelling errors.
-C
Consider run-together words as legal compounds.
-P
Don't generate extra root/affix combinations.
-m
Make possible root/affix combinations that aren't in the diction-
ary.
-S
Sort the list of guesses by probable correctness.
-d file
Specify an alternate dictionary file.
-p file
Specify an alternate personal dictionary.
-w chars
Specify additional characters that can be part of a word.
-W n
Specify length of words that are always legal.
-T type
Assume a given formatter type for all files.
All files specified on the command line must use forward slashes (/) for
directory separators.
The -n and -t options select whether ispell runs in nroff/troff (-n) or
TeX/LaTeX (-t) input mode. (The default is controlled by the DEFTEXFLAG
installation option.) TeX/LaTeX mode is also automatically selected if
an input file has the extension ".tex". In TeX/LaTeX mode, whenever a
backslash ("\") is found, ispell will skip to the next whitespace or
TeX/LaTeX delimiter. Certain commands contain arguments which should
not be checked, such as labels and reference keys as are found in the
\cite command, since they contain arbitrary, non-word arguments. Spell
checking is also suppressed when in math mode. Thus, for example, given
\chapter {This is a Ckapter} \cite{SCH86}
ispell will find "Ckapter" but not "SCH". The -t option does not recog-
nize the TeX comment character "%", so comments are also spell-checked.
It also assumes correct LaTeX syntax. Arguments to infrequently used
commands and some optional arguments are sometimes checked
unnecessarily. The bibliography will not be checked if ispell was com-
piled with IGNOREBIB defined. Otherwise, the bibliography will be
checked but the reference key will not.
References for the tib(1) bibliography system, that is, text between a
"[." or "<." and ".]" or ".>", will always be ignored in
TeX/LaTeX mode.
The -b and -x options control whether ispell leaves a backup (.bak) file
for each input file. The .bak file contains the pre-corrected text. If
there are file opening / writing errors, the .bak file may be left for
recovery purposes even with the -x option. The default for this option
is controlled by the DEFNOBACKUPFLAG installation option.
The -B and -C options control how ispell handles run-together words,
such as "notthe" for "not the". If -B is specified, such words will be
considered as errors, and ispell will list variations with an inserted
blank or hyphen as possible replacements. If -C is specified, run-
together words will be considered to be legal compounds as long as both
components are in the dictionary, and neither component is a single
character. This is useful for languages such as German and Norwegian,
where many compound words are formed by concatenation. (Note that
compounds formed from three or more root words will still be considered
errors.) The default for this option is language-dependent; in a
multi-lingual installation the default may vary depending on which dic-
tionary you choose.
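The -C rule stated above (exactly two components, both in the dictionary, neither a single character) can be sketched in Python as follows. The function name is_legal_compound and the use of a plain set of lowercase words as the dictionary are assumptions for illustration; ispell's own lookup also involves affix handling that is not modelled here.

```python
def is_legal_compound(word, dictionary):
    """Return True if word splits into two dictionary words,
    each at least two characters long."""
    # i is the split point; starting at 2 and stopping at len(word) - 2
    # keeps both components longer than a single character.
    for i in range(2, len(word) - 1):
        left, right = word[:i], word[i:]
        if left in dictionary and right in dictionary:
            return True
    return False
```

With the dictionary {"not", "the"}, "notthe" is accepted, while "notthenot" is rejected because it would need three roots.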
The -P and -m options control when ispell automatically generates sug-
gested root/affix combinations for possible addition to your personal
dictionary. (These are the entries in the "guess" list which are pre-
ceded by question marks.) If -P is specified, such guesses are displayed
only if ispell cannot generate any possibilities that match the current
dictionary. If -m is specified, such guesses are always displayed.
This can be useful if the dictionary has a limited word list, or a word
list with few suffixes. However, you should be careful when using this
option, as it can generate guesses that produce illegal words. The
default for this option is controlled by the dictionary file used.
The -S option suppresses ispell's normal behavior of sorting the list of
possible replacement words, leaving the guesses ordered by probable
correctness instead. Some people may prefer this, since it somewhat
enhances the probability that the correct word will be low-numbered.
The -d option is used to specify an alternate hashed dictionary file,
other than the value specified in the configuration file by the
ISDEFDICT keyword or the default. If the filename does not contain a
"/", the directory containing ispell.exe is searched; thus, to use a
dictionary in the local directory "-d ./xxx.hash" must be used. This
is useful to allow dictionaries for alternate languages. Unlike
previous versions of ispell, a dictionary of /dev/null is illegal,
because the dictionary contains the affix table. If you need an
effectively empty dictionary, create a one-entry list with an unlikely
string (e.g., "qqqqq").
The -p option is used to specify an alternate personal dictionary
file. If the file name is not an absolute path, the current directory
is searched. If the file is not found in the current directory, the
directory containing ispell.exe is searched. The shell variable
ISWORDLIST may be set, which renames the personal dictionary in the
same manner. The keyword ISDEFPDICT may also be set in ispell.cfg.
The environment variable will override the setting in the configuration
file. The command line overrides any ISWORDLIST or configuration file
settings. If you specify one of the default hash-files from the
library dictionary and the file "is_hashfile" exists, ispell will use
this file as the personal dictionary. This algorithm may not work
properly on FAT systems, because it is possible that the base portion
of the derived filename will be longer than 8 characters. If none of
these conditions are met, the file "is_words" is used.
If the -p option is not specified and the ISDEFPDICT keyword is not set,
ispell will look for personal dictionaries in both the current directory
and the directory where ispell.exe resides. If dictionaries exist in
both places, they will be merged. If any words are added to the
personal dictionary, they will be written to the current directory if a
dictionary already existed in that place; otherwise they will be
written to the dictionary in the directory where ispell.exe resides.
You are advised to create the file "~/is_hashfile" if you want to
use ispell on several different languages. For example, in Norway the
file "is_norsk" is used as the personal Norwegian dictionary and
the file "is_english" or "is_words" for the personal English dictionary.
Remember that the filenames must conform to your filesystem (FAT or HPFS).
The -w option may be used to specify characters other than alphabetics
which may also appear in words. For instance, -w "&" will allow "AT&T"
to be picked up. Underscores are useful in many technical documents.
There is an admittedly crude provision in this option for 8-bit interna-
tional characters. Non-printing characters may be specified in the
usual way by inserting a backslash followed by the octal character code;
e.g., "\014" for a form feed. Alternatively, if "n" appears in the
character string, the (up to) three characters following are a DECIMAL
code 0 - 255, for the character. For example, to include bells and form
feeds in your words (an admittedly silly thing to do, but aren't most
pedagogical examples):
n007n012
Numeric digits other than the three following "n" are simply numeric
characters. Use of "n" does not conflict with anything because actual
alphabetics have no meaning - alphabetics are already accepted. Ispell
will typically be used with input from a file, meaning that preserving
parity for possible 8 bit characters from the input text is OK. If you
specify the -l option, and actually type text from the terminal, this
may create problems if your stty settings preserve parity.
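The two numeric escapes accepted by -w can be illustrated with a small Python decoder. This is only a sketch of the notation as described above: the function name decode_word_chars is hypothetical, and treating an "n" or backslash that is not followed by digits as a literal character is an assumption, not documented behavior.

```python
def decode_word_chars(spec):
    """Decode a -w style string into the set of characters it names.

    'n' + up to three decimal digits  -> character with that decimal code
    '\\' + up to three octal digits   -> character with that octal code
    Codes above 255 are not rejected by this sketch.
    """
    chars = set()
    i = 0
    while i < len(spec):
        ch = spec[i]
        if ch in ("n", "\\"):
            digits = "0123456789" if ch == "n" else "01234567"
            base = 10 if ch == "n" else 8
            j = i + 1
            while j < len(spec) and j < i + 4 and spec[j] in digits:
                j += 1
            if j > i + 1:  # at least one digit followed the escape
                chars.add(chr(int(spec[i + 1:j], base)))
                i = j
                continue
        chars.add(ch)  # ordinary character (or escape with no digits)
        i += 1
    return chars
```

Under these assumptions, "n007n012" and "\007\014" decode to the same two characters, the bell and the form feed.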
The -W option may be used to change the length of words that ispell
always accepts as legal. Normally, ispell will accept all 1-character
words as legal, which is equivalent to specifying "-W 1." (The default
for this switch is actually controlled by the MINWORD installation
option, so it may vary at your installation.) If you want all words to
be checked against the dictionary, regardless of length, you might want
to specify "-W 0." On the other hand, if your document specifies a lot
of three-letter acronyms, you would specify "-W 3" to accept all words
of three letters or less. Regardless of the setting of this option,
ispell will only generate words that are in the dictionary as suggested
replacements for words; this prevents the list from becoming too long.
Obviously, this option can be very dangerous, since short misspellings
may be missed. If you use this option a lot, you should probably make a
last pass without it before you publish your document, to protect your-
self against errors.
The -T option is used to specify a default formatter type for use in
generating string characters. This switch overrides the default type
determined from the file name. If no -T option appears and no type can
be determined from the file name, the default string character type
declared in the language affix file will be used.
The -l or "list" option to ispell is used to produce a list of
misspelled words from the standard input.
The -a option is intended to be used from other programs through a pipe.
In this mode, ispell prints a one-line version identification message,
and then begins reading lines of input. For each input line, a single
line is written to the standard output for each word checked for spel-
ling on the line. If the word was found in the main dictionary, or your
personal dictionary, then the line contains only a '*'. If the word was
found through affix removal, then the line contains a '+', a space, and
the root word. If the word was found through compound formation (con-
catenation of two words, controlled by the -C option), then the line
contains only a '-'.
If the word is not in the dictionary, but there are near misses, then
the line contains an '&', a space, the misspelled word, a space, the
number of near misses, the number of characters between the beginning of
the line and the beginning of the misspelled word, a colon, another
space, and a list of the near misses separated by commas and spaces.
Following the near misses (and identified only by the count of near
misses), if the word could be formed by adding (illegal) affixes to a
known root, is a list of suggested derivations, again separated by com-
mas and spaces. If there are no near misses at all, the line format is
the same, except that the '&' is replaced by '?' (and the near-miss
count is always zero). The suggested derivations following the near
misses are in the form:
[prefix+] root [-prefix] [-suffix] [+suffix]
(e.g., "re+fry-y+ies" to get "refries") where each optional prefix and suffix
is a string. Also, each near miss or guess is capitalized the same as
the input word unless such capitalization is illegal; in the latter case
each near miss is capitalized correctly according to the dictionary.
Finally, if the word does not appear in the dictionary, and there are no
near misses, then the line contains a '#', a space, the misspelled word,
a space, and the character offset from the beginning of the line. The
output for each input line is terminated with an additional blank line,
indicating that ispell has completed processing the input line.
These output lines can be summarized as follows:
OK:        *
Root:      + <root>
Compound:  -
Miss:      & <original> <count> <offset>: <miss>, <miss>, ..., <guess>, ...
Guess:     ? <original> 0 <offset>: <guess>, <guess>, ...
None:      # <original> <offset>
For example, a dummy dictionary containing the words "fray", "Frey",
"fry", and "refried" might produce the following response to the command
"echo 'frqy refries' | ispell -a -m -d ./test.hash":
(#) International Ispell Version 3.0.05 (beta), 08/10/91
& frqy 3 0: fray, Frey, fry
& refries 1 5: refried, re+fry-y+ies
This mode is also suitable for interactive use when you want to figure
out the spelling of a single word.
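A program driving ispell through -a has to take these result lines apart. The following Python sketch parses one output line according to the formats summarized above. The function name parse_ispell_line and the dictionary keys it returns are hypothetical; the version banner, blank lines, and anything else unrecognized simply yield None.

```python
def parse_ispell_line(line):
    """Parse one line of 'ispell -a' output into a dict, or None."""
    line = line.rstrip("\n")
    if line == "*":
        return {"kind": "ok"}
    if line == "-":
        return {"kind": "compound"}
    if line.startswith("+ "):
        return {"kind": "root", "root": line[2:]}
    if line.startswith("# "):
        _, original, offset = line.split(" ")
        return {"kind": "none", "original": original, "offset": int(offset)}
    if line.startswith(("& ", "? ")):
        head, _, tail = line.partition(": ")
        flag, original, count, offset = head.split(" ")
        items = tail.split(", ") if tail else []
        count = int(count)
        return {
            "kind": "miss" if flag == "&" else "guess",
            "original": original,
            "offset": int(offset),
            # The first <count> items are near misses; the rest are
            # suggested derivations such as "re+fry-y+ies".
            "misses": items[:count],
            "guesses": items[count:],
        }
    return None
```

Applied to the second result line of the example above, this returns one near miss ("refried") and one guess ("re+fry-y+ies") at offset 5.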
The -A option works just like -a, except that if a line begins with the
string "&Include_File&", the rest of the line is taken as the name of a
file to read for further words. Input returns to the original file when
the include file is exhausted. Inclusion may be nested up to five deep.
The key string may be changed with the environment variable
INCLUDE_STRING (the ampersands, if any, must be included).
When in the -a mode, ispell will also accept lines of single words pre-
fixed with any of '*', '@', '+', '-', '~', '#', '!', '%', or '^'. A
line starting with '*' tells ispell to insert the word into the user's
dictionary (similar to the I command). A line starting with '@' causes
ispell to accept this word in the future (similar to the A command). A
line prefixed with a '+' will place ispell in TeX/LaTeX mode (similar to
the -t option) and '-' returns ispell to nroff/troff mode. A line
starting with '~' causes ispell to set internal parameters (in particu-
lar, the default string character type) based on the filename given in
the rest of the line. A line prefixed with '#' will cause the personal
dictionary to be saved. A line prefixed with '!' will turn on terse
mode (see below), and a line prefixed with '%' will return ispell to
normal (non-terse) mode. Any input following the prefix characters '+',
'-', '~', '#', '!', or '%' is ignored. To allow spell-checking of lines
beginning with these characters, a line starting with '^' has that char-
acter removed before it is passed to the spell-checking code. It is
recommended that programmatic interfaces prefix every data line with an
uparrow to protect themselves against future changes in ispell.
To summarize these:
* Add to personal dictionary
@ Accept word, but leave out of dictionary
# Save current personal dictionary
~ Set parameters based on filename
+ Enter TeX mode
- Exit TeX mode
! Enter terse mode
% Exit terse mode
^ Spell-check rest of line
In terse mode, ispell will not print lines beginning with '*', '+', or
'-', all of which indicate correct words. This significantly improves
running speed when the driving program is going to ignore correct words
anyway.
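As an illustration of the recommendation to prefix every data line with '^', the following Python sketch builds the text a driving program might write to an "ispell -a" pipe. The names protect and build_session are hypothetical, and nothing here starts ispell itself; it only prepares the input.

```python
def protect(text):
    """Prefix a data line with '^' so that a leading '*', '@', '+',
    '-', '~', '#', '!' or '%' is checked as text, not run as a command."""
    return "^" + text


def build_session(lines, terse=True):
    """Return the input for 'ispell -a': an optional '!' line to enter
    terse mode, followed by each data line protected with '^'."""
    out = []
    if terse:
        out.append("!")
    out.extend(protect(line) for line in lines)
    return "\n".join(out) + "\n"
```

Entering terse mode first means the driver only has to handle result lines that begin with '&', '?', or '#'.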
The -s option is only valid in conjunction with the -a or -A options,
and only on BSD-derived systems. If specified, ispell will stop itself
with a SIGTSTP signal after each line of input. It will not read more
input until it receives a SIGCONT signal. This may be useful for
handshaking with certain text editors.
The -f option is only valid in conjunction with the -a or -A options.
If -f is specified, ispell will write its results to the given file,
rather than to standard output.
The -v option causes ispell to print its current version identification
on the standard output and exit. If the switch is doubled, ispell will
also print the options that it was compiled with.
The -c, -e[e], and -D options of ispell, and the -c option of buildhash,
are primarily intended for use by the munchlist shell script. The -c
switch causes a list of words to be read from the standard input. For
each word, a list of possible root words and affixes will be written to
the standard output. Some of the root words will be illegal and must be
filtered from the output by other means; the munchlist script does this.
As an example, the command:
echo BOTHER | ispell -c
produces:
BOTHER BOTHE/R BOTH/R
The -e switch is the reverse of -c; it expands affix flags to produce a
list of words. For example, the command:
echo BOTH/R | ispell -e
produces:
BOTH BOTHER
If the -e switch is doubled (-ee), then the original line is output as
well:
BOTH/R BOTH BOTHER
Finally, the -D flag causes the affix tables from the dictionary file to
be dumped to standard output.
Unless it has been installed without the CAPITALIZATION option by your
system administrator, ispell is aware of the correct capitalizations of
words in the dictionary and in your personal dictionary. As well as
recognizing words that must be capitalized (e.g., George) and words that
must be all-capitals (e.g., NASA), it can also handle words with
"unusual" capitalization (e.g., "ITCorp" or "TeX"). If a word is capi-
talized incorrectly, the list of possibilities will include all accept-
able capitalizations. (More than one capitalization may be acceptable;
for example, my dictionary lists both "ITCorp" and "ITcorp".)
Normally, this feature will not cause you surprises, but there is one
circumstance you need to be aware of. If you use "I" to add a word to
your dictionary that is at the beginning of a sentence (e.g., the first
word of this paragraph if "normally" were not in the dictionary), it
will be marked as "capitalization required". A subsequent usage of this
word without capitalization (e.g., the quoted word in the previous sen-
tence) will be considered a misspelling by ispell, and it will suggest
the capitalized version. You must then compare the actual spellings by
eye, and then type "I" to add the uncapitalized variant to your personal
dictionary. You can avoid this problem by using "U" to add the original
word, rather than "I".
The rules for capitalization are as follows:
(1) Any word may appear in all capitals, as in headings.
(2) Any word that is in the dictionary in all-lowercase form may appear
either in lowercase or capitalized (as at the beginning of a sen-
tence).
(3) Any word that has "funny" capitalization (i.e., it contains both
cases and there is an uppercase character besides the first) must
appear exactly as in the dictionary, except as permitted by rule
(1). If the word is acceptable in all-lowercase, it must appear
thus in a dictionary entry.
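The three rules can be expressed as a short Python check. This is a sketch of the rules as stated here, not ispell's implementation: the function name capitalization_ok is hypothetical, and the dictionary is assumed to be a plain set of spellings with affixes ignored.

```python
def capitalization_ok(word, entries):
    """Return True if word's capitalization is acceptable given the
    dictionary spellings in entries."""
    same_letters = [e for e in entries if e.lower() == word.lower()]
    if not same_letters:
        return False  # the word is not in the dictionary at all
    if word.isupper():
        return True  # rule (1): all capitals is always allowed
    for entry in same_letters:
        # rule (2): an all-lowercase entry may appear in lowercase
        # or with only its first letter capitalized
        if entry.islower() and word in (entry, entry.capitalize()):
            return True
        # rule (3), and entries such as "George": exact match required
        if word == entry:
            return True
    return False
```

With only "Normally" in the dictionary, the lowercase "normally" is rejected, which is the surprise described above; adding the word with "U" stores "normally" instead, and both forms are then accepted.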
buildhash
The buildhash program builds hashed dictionary files for later use by
ispell. The raw word list (with affix flags) is given in dict-file, and
the affix flags are defined by affix-file. The hashed output is
written to hash-file. The formats of the two input files are described
in ispell4.doc. The -s (silent) option suppresses the usual status mes-
sages that are written to the standard error device.
The -c option to buildhash is used to combine various affixes for the
same root word. The affixes are defined in affix-file. The count given
with the -c option is an approximation of the number of root words in
the dictionary; an accurate count is not necessary but speeds the opera-
tion. Roots and affixes are read from standard input in the normal dic-
tionary format; they need not be sorted. All affixes for a given root
will be combined, and the resulting word list will be written (in an
essentially random order unrelated to the input order) to standard out-
put. For example, the command:
buildhash -c 3 english.aff << DONE
brother/S
sister/S
sister/M
brother/M
love/S
love/G
love/R
DONE
produces:
brother/MS
sister/MS
love/GRS
(though not necessarily in that order).
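The combining step can be pictured with a few lines of Python. This sketch only merges flags per root; it does not read an affix file, and it assumes single-character flags after the "/" as in the example above. The function name combine_affixes is hypothetical.

```python
def combine_affixes(entries):
    """Merge 'root/FLAGS' entries so that each root appears once with
    the union of its flags (flags sorted alphabetically)."""
    combined = {}
    for entry in entries:
        root, _, flags = entry.partition("/")
        combined.setdefault(root, set()).update(flags)
    return [root + "/" + "".join(sorted(flags)) if flags else root
            for root, flags in combined.items()]
```

Given the seven input lines of the example, this yields brother/MS, sister/MS, and love/GRS.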
It is possible to install ispell in such a way as to only support ASCII
range text if desired.
ENVIRONMENT
ISWORDLIST
Personal dictionary file name
INCLUDE_STRING
Code for file inclusion under the -A option
DEFAULT FILES
ISPELL_DIR/engmed.hash
Hashed dictionary (may be found in some other local directory,
depending on the system).
ISPELL_DIR/english.aff
Affix-definition file for munchlist
ISPELL_DIR/.is_words
User's private dictionary
./is_words
Directory-specific private dictionary
ISPELL_DIR is the directory where ispell.exe resides.
SEE ALSO
ispell4.doc, english4.doc
BUGS
It takes several to many seconds for ispell to read in the hash table,
depending on size.
When all options are enabled, ispell may take several seconds to gen-
erate all the guesses at corrections for a misspelled word; on slower
machines this time is long enough to be annoying.
The hash table is stored as a quarter-megabyte (or larger) array, so a
PDP-11 or 286 version does not seem likely.
Ispell should understand more troff syntax, and deal more intelligently
with contractions.
Although small personal dictionaries are sorted before they are written
out, the order of capitalizations of the same word is somewhat random.
When the -x flag is specified, ispell will unlink any existing .bak
file.
There are too many flags, and many of them have non-mnemonic names.
AUTHOR (unix version)
Pace Willisson (pace@mit-vax), 1983, based on the PDP-10 assembly ver-
sion. That version was written by R. E. Gorin in 1971, and later
revised by W. E. Matson (1974) and W. B. Ackerman (1978). Collected,
revised, and enhanced for the Usenet by Walt Buehring, 1987. Table-
driven multi-lingual version by Geoff Kuenning, 1987-88. Large dic-
tionaries provided by Bob Devine (vianet!devine). A complete list of
contributors is too large to list here, but is distributed with the
ispell sources in the file "Contributors".
VERSION
The version of ispell described by this manual page is International
Ispell Version 3.0.06 (beta), 09/17/91.
OS/2 PORT
This port of Ispell was done by Joe Huber (jbhuber@iastate.edu)
during 9/92. This man page is based on the man page included in the
original Ispell distribution and has been modified to reflect the
changes in the OS/2 version of Ispell.